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IN THE CLAIMS 

Please amend Claims 2, 42, 44-45, 47-50, 55-58, 60-63, 65, 67, 74-75, 79, 81, 85-86, 92, 
98-99, 105, 107, 111, 112, 121-123, 127-135 as follows: 

5 

1. (Cancelled) 

2. (Currently amended) For use in a system involving an Embedded-DRAM 
processor, a method for intelligent caching comprising the steps of: 

splitting an architecture into first and second portions, said first portion comprising a set 
10 of functional units and a set of architectural registers exercised thereby, said second portion 
comprising at least one functional unit capable of moving data between a main memory 
implemented as one or more banks of DRAM without a caching system that employs cache hits 
and cache misses, and said set of architectural registers; and 

splitting a single program into first and second concurrently executing portions which 
1 5 each concurrently execute distinct subsets of parallely dispatched instructions from one or more 
instruction streams, said first portion of said program executed on said first portion of the 
architecture, said second portion of said program executed on said second portion of said 
architecture; 

wherein said second portion of said architecture is operative to prefetch data from said 
20 main memory into said architectural registers prior to being processed by said first portion of said 
architecture, and wherein said second portion of said architecture is operative to move results 
produced by said first portion of said architecture into main memory after they are produced by 
said first portion of said architecture; and 

prior to when said first portion of said architecture executes a conditional branch 
25 instruction having a condition , said second portion of said architecture prefetches first and 

second data sets from memory into said architectural registers, said first data set being needed for 
use as instruction operands when said condition evaluates to true, said second data set being 
needed for use as instruction operands when said condition evaluates to false. 

3. -41. (Cancelled) 

30 42. (Currently amended) The method of Claim 2, wherein speculative prefetching of 

data is performed from said main memory so that the first portion of said program need not wait 
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for the first or the second data set to be fetched from said main memory, irrespective of the 
outcome of the conditional instruction. 

43. (Cancelled) 

44. (Currently amended) The method of Claim 2, wherein the second portion of said 
5 architecture generates a row precharge instruction to precharge a DRAM row of a bank of the 

one or more banks of DRAM to cause data to be ready prior to issuing a read command. 

45. (Currently amended) The method of Claim 2, wherein the set of architectural 
registers comprises a register file comprising a plurality of registers and having a parallel access 
port operative to load or store, under control of the second portion of said program , contents of 

10 said register file in a single DRAM access cycle from or to a DRAM row of a bank of the one or 
more banks of DRAM. 

46. (Previously presented) The method of Claim 45, wherein the load operation is 
performed with a mask to allow certain of the contents of selected registers of the register file not 
to be modified by the load operation. 

15 47. (Currently amended) The method of Claim 45, wherein said register file further 

comprises at least a second access port operative to transfer data between one or more of the 
functional units of the first portion of said architecture . 

48. (Currently amended) The method of Claim 45, wherein the first portion and the 
second portion of said architecture cooperate to execute a DRAM row selected by a row-address 

20 register, and said register file further comprises at least a second access port operative to transfer 
data between one or more of the functional units of the first portion of said architecture . 

49. (Currently amended) The method of Claim 45, wherein the register file can be placed 
into an inactive state where the register file does not appear in the a register space of the 
functional units of the first portion of said architecture . 

25 50. (Currently amended) The method of Claim 45, wherein when the register file is 

placed into the an inactive state, the second portion of the architecture is enabled to cause a 
parallel load or store operation to occur between the parallel access port and a row of DRAM in a 
bank of the one or more banks of DRAM. 
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51. (Previously presented) The method of Claim 2, wherein the first and second portions 
of said architecture cooperatively execute instructions to process image data for display. 

52. (Previously presented) The method of Claim 2, wherein the first and second portions 
of said architecture cooperatively execute instructions to process video data for display. 

5 53. (Previously presented) The method of Claim 2, wherein the first and second portions 

of said architecture cooperatively execute instructions to perform digital filtering operations. 

54. (Previously presented) The method of Claim 2, wherein the first and second portions 
of said architecture cooperatively execute instructions to execute a video decoder algorithm. 

55. (Currently amended) For use in a system involving an Embedded-DRAM 
10 processor, a method for intelligent caching comprising: 

providing an architecture split into first and second portions, said first portion comprising 
a set of functional units and a set of architectural registers exercised thereby, said second portion 
comprising at least one functional unit capable of moving data between a main memory 
implemented as one or more banks of DRAM, and said set of architectural registers, wherein the 
15 second portion accesses main memory without a caching system that employs cache hits and 
cache misses; and 

providing a single program split into first and second concurrently executing portions 
which each concurrently execute distinct subsets of parallely dispatched instructions from one or 
more instruction streams, said first portion of said program executed on said first portion of the 
20 architecture, said second portion of said program executed on said second portion of said 
architecture; 

wherein said second portion of said architecture is operative to prefetch data from said 
main memory into one or more of said architectural registers prior to being processed by said 
first portion of said architecture, and wherein said second portion of said architecture is operative 

25 to move results produced by said first portion of said architecture into main memory after they 
are produced by said first portion of said architecture; and 

prior to when said first portion of said architecture executes a conditional branch 
instruction having a condition , said second portion of said architecture prefetches first and 
second data sets from said main memory into said architectural registers, said first data set being 

30 needed for use as instruction operands when said condition evaluates to true, said second data set 
being needed for use as instruction operands when said condition evaluates to false. 
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56. (Currently amended) The method of Claim 55, wherein speculative prefetching 
of data is performed from said main memory so that the first portion of said program need not 
wait for the first or the second data set to be fetched from said main memory, irrespective of the 
outcome of the conditional instruction. 

5 57. (Currently amended) The method of Claim 55, wherein the second portion of 

said architecture generates a row precharge instruction to precharge a DRAM row of a bank of 
the one or more banks of DRAM to cause data to be ready prior to issuing a read command. 

58. (Currently amended) The method of Claim 55, wherein the set of architectural 
registers comprises a register file comprising a plurality of registers and having a parallel access 

10 port operative to load or store, under control of the second portion of said program , contents of 
said register file in a single DRAM access cycle from or to a DRAM row of a bank of the one or 
more banks of DRAM. 

59. (Previously presented) The method of Claim 58, wherein the load operation is 
performed with a mask to allow certain of the contents of selected registers of the register file not 

15 to be modified by the load operation. 

60. (Currently amended) The method of Claim 58, wherein said register file further 
comprises at least a second access port operative to transfer data between one or more of the 
functional units of the first portion of said architecture . 

61. (Currently amended) The method of Claim 58, wherein the first portion and the 

20 second portion of said architecture cooperate to execute a DRAM row selected by a row-address 
register, and said register file further comprises at least a second access port operative to transfer 
data between one or more of the functional units of the first portion of said architecture . 

62. (Currently amended) The method of Claim 58, wherein the register file can be placed 
into an inactive state where the register file does not appear in the a register space of the 

25 functional units of the first portion. 

63. (Currently amended) The method of Claim 58, wherein when the register file is 
placed into the an inactive state, the second portion of said architecture is enabled to cause a 
parallel load or store operation to occur between the parallel access port and a row of DRAM in a 
bank of the one or more banks of DRAM. 
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64. (Previously presented) The method of Claim 55, wherein the first and second portions 
of said architecture cooperatively execute instructions to process at least one of image data and 
video data, for display. 

65. (Currently amended) The method of Claim 55, wherein the first and second portions 
5 of said program cooperatively execute instructions to perform digital filtering operations. 

66. (Previously presented) The method of Claim 55, wherein when the first and second 
portions of said architecture cooperatively execute instructions to execute a video decoder 
algorithm. 



1 0 caching, comprising: 

providing an architecture comprising first and second portions, said first portion 
comprising a set of functional units and a set of architectural registers exercised thereby, said 
second portion comprising at least one functional unit capable of moving data between a main 
memory implemented substantially as one or more banks of DRAM, and said set of architectural 

1 5 registers, wherein the second portion accesses main memory without a caching system that 
employs cache hits and cache misses; and 

providing a program comprising first and second program portions which each 
concurrently execute subsets of parallely dispatched instructions from one or more instruction 
streams, said first program portion executed on said first portion of the architecture, said second 

20 program portion executed on said second portion of said architecture; 

wherein said second portion of said architecture is operative to prefetch data from said 
main memory into one or more of said architectural registers prior to being processed by said 
first portion of said architecture, and wherein said second portion of said architecture is operative 
to move results produced by said first portion of said architecture into main memory after they 

25 are produced by said first portion of said architecture; and 

prior to when said first portion of said architecture executes a conditional instruction 
having a condition , said second portion of said architecture prefetches first and second data sets 
from said main memory into said architectural registers, said first data set being needed for use as 
instruction operands when said condition is true, said second data set being needed for use as 

30 instruction operands when said condition is false. 



67. (Currently amended) 



In an embedded-DRAM processor, a method for intelligent 
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68. (Previously presented) The method of Claim 67, wherein speculative prefetching 
of data is performed from said main memory so that the first program portion need not wait for 
the first or the second data set to be fetched from said main memory, irrespective of the outcome 
of the conditional instruction. 

5 69. (Previously presented) The method of Claim 67, wherein the second portion 

generates a row precharge instruction to precharge a DRAM row of a bank of the one or more 
banks of DRAM to cause data to be ready prior to issuing a read command. 

70. (Previously presented) The method of Claim 67, wherein the set of architectural 
registers comprises a register file comprising a plurality of registers and having a parallel access 

10 port operative to load or store, under control of the second portion, contents of said register file 
in a single DRAM access cycle from or to a DRAM row of a bank of the one or more banks of 
DRAM. 

71. (Previously presented) The method of Claim 70, wherein the load operation is 
performed with a mask to allow certain of the contents of selected registers of the register file not 

15 to be modified by the load operation. 

72. (Previously presented) The method of Claim 70, wherein said register file further 
comprises at least a second access port operative to transfer data between one or more of the 
functional units of the first portion. 

73. (Previously presented) The method of Claim 70, wherein the first portion and the 

20 second portion of said architecture cooperate to execute a DRAM row selected by a row-address 
register, and said register file further comprises at least a second access port operative to transfer 
data between one or more of the functional units of the first portion. 

74. (Currently amended) The method of Claim 70, wherein the register file can be placed 
into an inactive state where the register file does not appear in the a register space of the 

25 functional units of the first portion. 

75. (Currently amended) The method of Claim 70, wherein when the register file is 
placed into the an inactive state, the second portion is enabled to cause a parallel load or store 
operation to occur between the parallel access port and a row of DRAM in a bank of the one or 
more banks of DRAM. 
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76. (Previously presented) The method of Claim 67, wherein the first and second portions 
of said architecture cooperatively execute instructions to process at least one of image data or 
video data for display. 

77. (Previously presented) The method of Claim 67, wherein the first and second portions 
5 of said architecture cooperatively execute instructions to perform digital filtering operations. 

78. (Previously presented) The method of Claim 67, wherein when the first and second 
portions of said architecture cooperatively execute instructions to execute a video decoder 
algorithm. 



10 intelligent caching comprising: 

utilizing an architecture comprising first and second portions, said first portion 
comprising a set of functional units and a set of architectural registers exercised thereby, said 
second portion comprising at least one functional unit capable of moving data between a main 
memory implemented as one or more banks of RAM, and said set of architectural registers, 

15 wherein the second portion accesses main memory without a caching system that employs cache 
hits and cache misses; and 

utilizing a single program having first and second concurrently executing portions to each 
concurrently execute distinct subsets of parallely dispatched instructions from one or more 
instruction streams, said first portion of said program executed on said first portion of the 

20 architecture, said second portion of said program executed on said second portion of said 
architecture; 

wherein said second portion of said architecture is operative to fetch data in said main 
memory prior to being loaded into said first portion of said architecture, and wherein said second 
portion of said architecture is operative to move results produced by said first portion of said 
25 architecture into main memory after they are produced by said first portion of said architecture; 
and 

prior to when said first portion of said architecture executes a conditional instruction 
having a condition , said second portion of said architecture precharges first and second data sets 
in respective RAM rows, said first data set being needed for use as instruction operands when 
30 said condition is true, said second data set being needed for use as instruction operands when 
said condition is false. 



79. (Currently amended) 



In an Embedded-RAM processing apparatus, a method for 
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80. (Previously presented) The method of Claim 79, wherein the second portion of 
said architecture generates a row precharge instruction to precharge a RAM row of a bank of the 
one or more banks of RAM to cause data to be ready prior to issuing a read command. 

81. (Currently amended) The method of Claim 79, wherein the set of architectural 

5 registers comprises a register file comprising a plurality of registers and having a parallel access 
port operative to load or store, under control of the second portion of said program , contents of 
said register file in a single RAM access cycle from or to a RAM row of a bank of the one or 
more banks of RAM. 

82. (Previously presented) The method of Claim 81, wherein at least one load operation is 
10 performed with a mask to allow certain of the contents of selected registers of the register file not 

to be modified by the at least one load operation. 

83. (Previously presented) The method of Claim 81, wherein said register file further 
comprises at least a second access port operative to transfer data between one or more of the 
functional units of the first portion. 

1 5 84. (Previously presented) The method of Claim 8 1 , wherein the first portion and the 

second portion of said architecture cooperate to execute a RAM row selected by a row-address 
register, and said register file further comprises at least a second access port operative to transfer 
data between one or more of the functional units of the first portion. 

85. (Currently amended) The method of Claim 81, wherein the register file can be placed 
20 into an inactive state where the register file does not appear in the a register space of the 

functional units of the first portion. 

86. (Currently amended) The method of Claim 81, wherein when the register file is 
placed into the an inactive state, the second portion of said architecture is enabled to cause a 
parallel load or store operation to occur between the parallel access port and a row of RAM in a 

25 bank of the one or more banks of RAM. 

87. (Previously presented) The method of Claim 79, wherein the first and second portions 
of said architecture cooperatively execute instructions to process at least one of image data or 
video data for display. 

88. (Previously presented) The method of Claim 79, wherein the first and second portions 
30 of said architecture cooperatively execute instructions to perform digital filtering operations. 
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89. (Previously presented) The method of Claim 79, wherein when the first and second 
portions of said architecture cooperatively execute instructions to execute a video decoder 
algorithm. 

90. (Previously presented) The method of Claim 79, wherein said RAM comprises scroll- 

5 RAM. 

91. (Previously presented) The method of Claim 79, wherein said RAM comprises 
synchronous DRAM (SDRAM). 

92. (Currently amended) In an Embedded-RAM processing apparatus, a method for 
intelligent caching comprising: 

10 providing an architecture comprising first and second portions, said first portion 

comprising a set of functional units and a set of architectural registers exercised thereby, said 
second portion comprising at least one functional unit capable of moving data between a main 
memory implemented as one or more banks of RAM, and said set of architectural registers, 
wherein the second portion accesses main memory without a caching system that employs cache 

1 5 hits and cache misses; and 

utilizing a program comprising first and second program portions to each concurrently 
execute distinct subsets of parallely dispatched instructions from one or more instruction streams, 
said first program portion executed on said first portion of the architecture, said second program 
portion executed on said second portion of said architecture; 

20 wherein said second portion of said architecture is operative to prefetch data from said 

main memory and to pass it into one or more of said architectural registers prior to being 
processed by said first portion of said architecture, and wherein said second portion of said 
architecture is operative to move results produced by said first portion of said architecture into 
main memory after they are produced by said first portion of said architecture; and 

25 prior to when said first portion of said architecture executes a conditional instruction 

having a condition , said second portion of said architecture prefetches first and second data sets 
from said main memory into said architectural registers, said first data set being needed for use as 
instruction operands when said condition is true, said second data set being needed for use as 
instruction operands when said condition is false. 
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93. (Previously presented) The method of Claim 92, wherein the second portion of 
said architecture generates a row precharge instruction to precharge a RAM row of a bank of the 
one or more banks of RAM to cause data to be ready prior to issuing a read command. 

94. (Previously presented) The method of Claim 92, wherein the set of architectural 

5 registers comprises a register file comprising a plurality of registers and having a parallel access 
port operative to load or store, under control of the second portion, contents of said register file 
in a single RAM access cycle from or to a RAM row of a bank of the one or more banks of 
RAM. 

95. (Previously presented) The method of Claim 94, wherein at least one load operation is 
10 performed with a mask to allow certain of the contents of selected registers of the register file not 

to be modified by the at least one load operation. 

96. (Previously presented) The method of Claim 94, wherein said register file further 
comprises at least a second access port operative to transfer data between one or more of the 
functional units of the first portion. 

15 97. (Previously presented) The method of Claim 94, wherein the first portion and the 

second portion of said architecture cooperate to execute a RAM row selected by a row-address 
register, and said register file further comprises at least a second access port operative to transfer 
data between one or more of the functional units of the first portion. 

98. (Currently amended) The method of Claim 94, wherein the register file can be placed 
20 into an inactive state where the register file does not appear in the a register space of the 

functional units of the first portion. 

99. (Currently amended) The method of Claim 94, wherein when the register file is 
placed into the an inactive state, the second portion is enabled to cause a parallel load or store 
operation to occur between the parallel access port and a row of RAM in a bank of the one or 

25 more banks of RAM. 

100. (Previously presented) The method of Claim 92, wherein the first and second 
portions of said architecture cooperatively execute instructions to process at least one of image 
data or video data for display. 
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101. (Previously presented) The method of Claim 92, wherein the first and second 
portions of said architecture cooperatively execute instructions to perform digital filtering 
operations. 

102. (Previously presented) The method of Claim 92, wherein when the first and second 
5 portions of said architecture cooperatively execute instructions to execute a video decoder 

algorithm. 

103. (Previously presented) The method of Claim 92, wherein said RAM comprises 
scroll-RAM. 

104. (Previously presented) The method of Claim 92, wherein said RAM comprises 
1 0 synchronous DRAM (SDRAM). 

105. (Currently amended) In an Embedded-DRAM processor, a method for caching 
comprising: 

providing an architecture comprising first and second portions, said first portion 
comprising a set of functional units and a set of architectural registers exercised thereby, said 
15 second portion comprising at least one functional unit capable of moving data between a main 
memory implemented as one or more banks of DRAM, and said set of architectural registers, the 
second portion being capable of accessing main memory without a caching system that employs 
cache hits or cache misses; and 

utilizing a program comprising first and second program portions which each 
20 concurrently execute distinct subsets of parallely dispatched instructions from one or more 

instruction streams, said first program portion executed on said first portion of the architecture, 
said second program portion executed on said second portion of said architecture; 

wherein said second portion of said architecture is operative to prefetch data from said 
main memory and to pass it to one or more of said registers prior to being processed by said first 
25 portion of said architecture, and wherein said second portion of said architecture is operative to 
move results produced by said first portion of said architecture to frame buffer memory after they 
are produced by said first portion of said architecture; and 

prior to when said first portion of said architecture executes a conditional instruction 
having a condition , said second portion of said architecture prefetches first and second data sets 
30 from main memory into said registers, said first data set being needed for use as instruction 
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operands when said condition is true, said second data set being needed for use as instruction 
operands when said condition is false. 

106. (Previously presented) The method of Claim 105, wherein the second portion of 
said architecture generates a row precharge instruction to precharge a DRAM row of a bank of 

5 the one or more banks of DRAM to cause data to be ready prior to issuing a read command. 

107. (Currently amended) The method of Claim 105, wherein the set of architectural 
registers comprises a register file comprising a plurality of registers and having a parallel access 
port operative to load or store, under control of the second portion of said program , contents of 
said register file in a single DRAM access cycle from or to a DRAM row of a bank of the one or 

1 0 more banks of DRAM. 

108. (Previously presented) The method of Claim 107, wherein at least one load operation 
is performed with a mask to allow certain of the contents of selected registers of the register file 
not to be modified by the at least one load operation. 

109. (Previously presented) The method of Claim 107, wherein said register file further 
1 5 comprises at least a second access port operative to transfer data between one or more of the 

functional units of the first portion. 

110. (Previously presented) The method of Claim 107, wherein the first portion and the 
second portion of said architecture cooperate to execute a RAM row selected by a row-address 
register, and said register file further comprises at least a second access port operative to transfer 

20 data between one or more of the functional units of the first portion. 

111. (Currently amended) The method of Claim 107, wherein the register file can be 
placed into an inactive state where the register file does not appear in the a register space of the 
functional units of the first portion. 

1 12. (Currently amended) The method of Claim 107, wherein when the register file is 
25 placed into the an inactive state, the second portion is enabled to cause a parallel load or store 

operation to occur between the parallel access port and a row of RAM in a bank of the one or 
more banks of RAM. 
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113 . (Previously presented) The method of Claim 105, wherein the first and second 
portions of said architecture cooperatively execute instructions to process at least one of image 
data or video data for display. 

114. (Previously presented) The method of Claim 105, wherein the first and second 
5 portions of said architecture cooperatively execute instructions to perform digital filtering 

operations. 

115. (Previously presented) The method of Claim 105, wherein when the first and second 
portions of said architecture cooperatively execute instructions to execute a video decoder 
algorithm. 

10 116. (Previously presented) For use in a system involving an Embedded-DRAM 

processor, a method for caching comprising: 

providing an architecture comprising first and second portions, said first portion 
comprising a set of functional units and a set of architectural registers exercised thereby, said 
second portion comprising at least one functional unit capable of moving data between a main 
15 memory implemented as one or more banks of DRAM, and said set of architectural registers, the 
second portion accessing main memory without a caching system that employs cache hits and 
cache misses; and 

providing a program comprising first and second program portions concurrently 
executing respective ones of distinct subsets of parallely dispatched instructions from one or 
20 more instruction streams, said first program portion executing on said first portion of the 

architecture, said second program portion executing on said second portion of said architecture; 

wherein said second portion of said architecture is operative to prefetch data from said 
main memory and to pass it into one or more of said architectural registers prior to processing by 
said first portion of said architecture, and wherein said second portion of said architecture is 
25 operative to move results produced by said first portion of said architecture into a frame buffer 
memory after they are produced by said first portion of said architecture; and 

prior to when said first portion of said architecture executes an instruction, said second 
portion of said architecture prefetching first and second data sets from main memory into said 
architectural registers, said first and second data sets being needed for use as instruction 
30 operands. 
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1 1 7. (Previously presented) The method of Claim 1 1 6, wherein the second portion of 
said architecture generates a row precharge instruction to precharge a DRAM row of a bank of 
the one or more banks of DRAM to cause data to be ready prior to issuing a read command. 

118. (Previously presented) The method of Claim 1 16, wherein the set of architectural 
5 registers comprises a register file comprising a plurality of registers and having a parallel access 

port operative to load or store, under control of the second portion, contents of said register file 
in a single DRAM access cycle from or to a DRAM row of a bank of the one or more banks of 
DRAM. 

1 19. (Previously presented) The method of Claim 118, wherein at least one load operation 
10 is performed with a mask to allow certain of the contents of selected registers of the register file 

not to be modified by the at least one load operation. 

120. (Previously presented) The method of Claim 118, wherein said register file further 
comprises at least a second access port operative to transfer data between one or more of the 
functional units of the first portion. 

15 121. (Currently amended) The method of Claim 118, wherein th e first portion and the 

s e cond portion of said archit e ctur e coop e rat e to e x e cut e a RAM row s e l e ct e d by a row addr e ss 
r e gist e r, and said register file further comprises at least a second access port operative to transfer 
data between one or more of the functional units of the first portion of said architecture . 

122. (Currently amended) The method of Claim 118, wherein the register file can be 
20 placed into an inactive state where the register file does not appear in the a register space of the 

functional units of the first portion. 

123. (Currently amended) The method of Claim 118, wherein when the register file is 
placed into the an inactive state, the second portion is enabled to cause a parallel load or store 
operation to occur between the parallel access port and a row of RAM in a bank of the one or 

25 more banks of RAM. 

124. (Previously presented) The method of Claim 1 16, wherein the first and second 
portions of said architecture cooperatively execute instructions to process at least one of image 
data or video data for display. 
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125. (Previously presented) The method of Claim 116, wherein the first and second 
portions of said architecture cooperatively execute instructions to perform digital filtering 
operations. 

126. (Previously presented) The method of Claim 1 16, wherein when the first and second 
5 portions of said architecture cooperatively execute instructions to execute a video decoder 

algorithm. 

127. (Currently amended) For use in a system involving an Embedded-DRAM 
processor, said system comprising (i) an architecture comprising first and second portions, said 
first portion comprising a set of functional units and a set of architectural registers exercised 

10 thereby, said second portion comprising at least one functional unit capable of moving data 
between a main memory implemented as one or more banks of DRAM, and said set of 
architectural registers, wherein the second portion accesses main memory without a caching 
system that employs cache hits and cache misses, and (ii) a program comprising first and second 
program portions which each concurrently execute distinct subsets of parallely dispatched 

1 5 instructions from one or more instruction streams, said first program portion executed on said 
first portion of the architecture, said second program portion executed on said second portion of 
said architecture, a method for intelligent caching comprising: 

prefetching, using at least said second portion of said architecture, data from said main 
memory into one or more of said architectural registers prior to being processed by said first 

20 portion of said architecture, and wherein said second portion of said architecture is operative to 
move results produced by said first portion of said architecture into main memory after they are 
produced by said first portion of said architecture; and 

prefetching, using at least said second portion of said architecture and prior to when said 
first portion of said architecture executes a conditional branch instruction having a condition , 

25 first and second data sets from said main memory into said architectural registers, said first data 
set being needed for use as instruction operands when said condition is true, said second data set 
being needed for use as instruction operands when said condition is false. 

128. (Currently amended) For use in a system involving an Embedded-RAM 
processor, said system comprising (i) an architecture comprising first and second portions, said 

30 first portion comprising a set of functional units and a set of architectural registers exercised 
thereby, said second portion comprising at least one functional unit capable of moving data 
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between a main memory implemented as one or more banks of RAM, and said set of 
architectural registers, wherein the second portion accesses main memory without a caching 
system that employs cache hits and cache misses, and (ii) a program comprising first and second 
program portions which each concurrently execute distinct subsets of parallely dispatched 
5 instructions from one or more instruction streams, said first program portion executed on said 
first portion of the architecture, said second program portion executed on said second portion of 
said architecture, a method for intelligent caching comprising: 

fetching, using at least said second portion of said architecture, data in said main memory 
prior to being loaded into said first portion of said architecture, and wherein said second portion 

10 of said architecture is operative to move results produced by said first portion of said architecture 
into main memory after they are produced by said first portion of said architecture; and 

precharging, using at least said second portion of said architecture, and prior to when said 
first portion of said architecture executes a conditional instruction having a condition , first and 
second data sets in respective RAM rows, said first data set being needed for use as instruction 

15 operands when said condition is true, said second data set being needed for use as instruction 
operands when said condition is false. 

129. (Currently amended) In an Embedded-RAM processing apparatus comprising (i) 
an architecture comprising first and second portions, said first portion comprising a set of 
functional units and a set of architectural registers exercised thereby, said second portion 

20 comprising at least one functional unit capable of moving data b e tween a at least a portion of the 
contents of a main memory implemented as one or more banks of RAM^and to said set of 
architectural registers, wherein the second portion accesses main memory without a caching 
system that employs cache hits and cache misses, and (ii) a program comprising first and second 
program portions which concurrently execute distinct subsets of parallely dispatched instructions 

25 from one or more instruction streams, said first program portion executed on said first portion of 
the architecture, said second program portion executed on said second portion of said 
architecture, a method for intelligent caching comprising: 

fetching, using at least said second portion of said architecture, data in said main memory 
prior to said data being passed to one or more of said architectural registers prior to said data . 

30 being processed by said first portion of said architecture, and wherein said second portion of said 
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architecture is operative to move results produced by said first portion of said architecture into 
said main memory after they are produced by said first portion of said architecture; and 

precharging, using at least said second portion of said architecture, and prior to when said 
first portion of said architecture executes a conditional instruction having a condition , from said 
5 main memory into said architectural registers, first and second data sets, said first data set being 
needed for use as instruction operands when said condition is true, said second data set being 
needed for use as instruction operands when said condition is false. 

130. (Currently amended) In an Embedded-DRAM processing apparatus comprising 
(i) an architecture split into first and second portions, said first portion comprising a set of 

10 functional units and a set of architectural registers exercised thereby, said second portion 

comprising at least one functional unit capable of moving data b e tw ee n a at least a portion of the 
contents of a main memory implemented as one or more banks of DRAM , and to said set of 
architectural registers, wherein the second portion accesses main memory without a caching 
system that employs cache hits and cache misses, and (ii) a program comprising first and second 

1 5 program portions which concurrently execute distinct subsets of parallely dispatched instructions 
from one or more instruction streams, said first program portion executed on. said first portion of 
the architecture, said second program portion executed on said second portion of said 
architecture, a method for intelligent caching comprising: 

fetching, using at least said second portion of said architecture, data in said main memory 

20 prior to said data being passed to one or more of said architectural registers prior to said data 

being processed by said first portion of said architecture, and wherein said second portion of said 
architecture is operative to move results produced by said first portion of said architecture into a 
frame buffer memory after they are produced by said first portion of said architecture; and 

charging, using at least said second portion of said architecture, and prior to when said 

25 first portion of said architecture executes a conditional instruction, from said main memory into 
said architectural registers, first and second data sets, said first and second data sets being needed 
for use as instruction operands. 

131. (Currently amended) In an Embedded-DRAM processing apparatus comprising 
(i) an architecture comprising first and second portions, said first portion comprising a set of 
30 functional units and a set of architectural registers exercised thereby, said second portion 

comprising at least one functional unit capable of moving data b e tween a at least a portion of the 
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contents of a main memory implemented as one or more banks of DRAMrand to said set of 
architectural registers, wherein the second portion accesses main memory without a caching 
system that employs cache hits and cache misses, and (ii) a program comprising first and second 
program portions which concurrently execute distinct subsets of parallely dispatched instructions 
5 from one or more instruction streams, said first program portion executed on said first portion of 
the architecture, said second program portion executed on said second portion of said 
architecture, a method for intelligent caching comprising: 

fetching, using at least said second portion of said architecture, data in said main memory 
prior to said data being passed to one or more of said architectural registers prior to said data 

10 being processed by said first portion of said architecture, and wherein said second portion of said 
architecture is operative to move results produced by said first portion of said architecture into 
said main memory after they are produced by said first portion of said architecture; and 

charging, using at least said second portion of said architecture, and prior to when said 
first portion of said architecture executes a conditional instruction having a condition , in 

15 respective DRAM rows, first and second data sets, said first data set. being needed for use as 
instruction operands when said condition is true, said second data set being needed for use as 
instruction operands when said condition is false. 

132. (Currently amended) In an Embedded-DRAM processing apparatus comprising 
(i) an architecture comprising first and second portions, said first portion comprising a set of 

20 functional units and a set of architectural registers exercised thereby, said second portion 

comprising at least one functional unit capable of moving data b e tween a at least a portion of the 
contents of a main memory implemented as one or more banks of DRAMrand to said set of 
registers, wherein the second portion accesses main memory without a caching system that 
employs cache hits and cache misses, and (ii) a program comprising first and second program 

25 portions which concurrently execute distinct subsets of parallely dispatched instructions from one 
or more instruction streams, said first program portion executed on said first portion of the 
architecture, said second program portion executed on said second portion of said architecture, a 
method for intelligent caching comprising: 

fetching, using at least said second portion of said architecture, data in said main memory 

30 prior to said data being passed to one or more of said registers prior to said data being processed 
by said first portion of said architecture, and wherein said second portion of said architecture is 
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operative to move results produced by said first portion of said architecture into said main 
memory after they are produced by said first portion of said architecture; and 

prefetching, using at least said second portion of said architecture, and prior to when said 
first portion of said architecture executes a conditional instruction, in respective DRAM rows, 
5 first and second data sets into said registers, said first and second data sets being needed for use 
as operands. 

133. (Currently amended) For use in an Embedded-DRAM processing apparatus (i) 
first architecture means and second architecture means, said first means comprising a set of 
functional means and a set of architectural register means exercised thereby, said second 

10 architecture means comprising at least one functional means capable of moving data b e tw ee n at 
least a portion of the contents of one or more banks of means for storing dat a, and to said set of 
register means, wherein the second architectural means accesses said means for storing data 
without a caching system that employs cache hits and cache misses, and (ii) a program 
comprising first and second program portions which each concurrently execute distinct subsets of 

15 parallely dispatched instructions from one or more instruction streams, said first program portion 
executed on said first architecture means, said second program portion executed on said second 
architecture means, a method for caching comprising: 

prefetching, using at least said second architecture means, data from said means for 
storing data into one or more of said register means prior to said data being processed by said 

20 first architecture means, and wherein said second architecture means is operative to move results 
produced by said first architecture means into said means for storing after they are produced by 
said first architecture means; and 

prefetching, using at least said second architecture means and prior to when said first 
architecture means executes a conditional instructio n having a condition , first and second data 

25 sets from said means for storing into said register means, said first data set being needed for use 
as operands when said condition is true, said second data set being needed for use as operands 
when said condition is false. 

134. (Currently amended) In an Embedded-RAM processing apparatus comprising (i) 
an architecture comprising first and second portions, said first portion comprising a set of 

30 functional means and a set of registers exercised thereby, said second portion comprising at least 
one functional means capable of at least moving data b e tw ee n contents of a main memory 
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implemented as one or more banks of RAM^and to said set of registers, wherein the second 
portion accesses main memory without a caching system that employs cache hits and cache 
misses, and (ii) a program comprising first and second program portions which concurrently 
execute distinct subsets of parallely dispatched instructions from one or more instruction streams, 
5 said first program portion executed on said first portion of the architecture, said second program 
portion executed on said second portion of said architecture, a method for caching comprising: 

fetching, using at least said second portion of said architecture, data in said main memory 
prior to said data being passed to one or more of said registers prior to said data being processed 
by said first portion of said architecture, and wherein said second portion of said architecture is 

10 operative to move results produced by said first portion of said architecture into said main 
memory after they are produced by said first portion of said architecture; and 

precharging first and second data sets into said registers, using at least said second portion 
of said architecture, and prior to when said first portion of said architecture executes a 
conditional instruction having a condition , said first data set being needed for use as operands 

15 when said condition is true, said second data set being needed for use as operands when said 
condition is false. 

135. (Currently amended) In an Embedded-RAM processing apparatus comprising (i) 
an architecture comprising first and second portions, said first portion comprising a set of 
functional means and a set of registers exercised thereby, said second portion comprising at least 

20 one functional means capable of at least moving data b e tw ee n contents of a main memory 
implemented as one or more banks of RAM^and to said set of registers, wherein the second 
portion accesses main memory without a caching system that employs cache hits and cache 
misses, and (ii) a program comprising first and second program portions which concurrently 
execute distinct subsets of parallely dispatched instructions from one or more instruction streams, 

25 said first program portion executed on said first portion of the architecture, said second program 
portion executed on said second portion of said architecture, a method for caching comprising: 

fetching, using at least said second portion of said architecture, data in said main memory 
prior to said data being passed to one or more of said registers prior to said data being processed 
by said first portion of said architecture, and wherein said second portion of said architecture is 

30 operative to move results produced by said first portion of said architecture into a frame buffer 
after they are produced by said first portion of said architecture; and 
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charging first and second data sets into said registers, using at least said second portion of 
said architecture, and prior to when said first portion of said architecture executes a conditional 
instruction, said first and second data sets being needed for use as operands. 

5 
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